346 PART 6 Analyzing Survival Data

Some points to keep in mind:

» If your software outputs a zero-based baseline survival function, you don’t

subtract the average value from the patient’s value. Instead, calculate the v

term as the product of the patient’s predictor value and the regression

coefficient.

» If a predictor is a categorical variable, you have to code the levels as numbers.

If you have a dichotomous variable like pregnancy status, you could code not

pregnant = 0 and pregnant = 1. Then, if 47.2 percent of a sample that includes

only women is pregnant, the average pregnancy status is 0.472. If

the patient is not pregnant, the subtraction in Step 1 is 0 – 0.472, giving –0.472.

If the patient is pregnant, you would use the equation 1 – 0.472, giving 0.528.

Then you carry out all the other steps exactly as described.

» It’s even a little trickier for multivalued categories (such as different clinical

centers) because you have to code each of these variables as a set of indicator

variables.
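The points above can be sketched in Python. This is a minimal illustration, not the output of any particular package. The function names (v_term, indicator_code), the coefficient value 0.9, and the clinical center labels are hypothetical. Only the 0/1 coding and the 0.472 average come from the text.

```python
def v_term(values, coefs, means=None):
    """Return v = sum of coefficient * (value - mean) over the predictors.

    Pass means=None for a zero-based baseline survival function,
    in which case nothing is subtracted from the patient's values.
    """
    if means is None:
        means = [0.0] * len(values)
    return sum(b * (x - m) for x, b, m in zip(values, coefs, means))


def indicator_code(level, levels):
    """Code one multivalued category as indicator (0/1) variables.

    The first entry of `levels` is treated as the reference level,
    so a category with k levels yields k - 1 indicators.
    """
    return [1 if level == other else 0 for other in levels[1:]]


# Dichotomous predictor from the text: not pregnant = 0, pregnant = 1,
# with a sample average of 0.472.
mean_pregnant = 0.472
diff_not_pregnant = 0 - mean_pregnant   # the Step 1 subtraction, -0.472
diff_pregnant = 1 - mean_pregnant       # about 0.528

# Hypothetical regression coefficient, for illustration only.
coef = 0.9
v_centered = v_term([1], [coef], [mean_pregnant])  # baseline at the average
v_zero_based = v_term([1], [coef])                 # zero-based baseline

# Hypothetical clinical centers A, B, C with A as the reference level.
center_b = indicator_code("B", ["A", "B", "C"])    # [1, 0]
```

With a zero-based baseline, the same patient gets a larger v because nothing is subtracted. The two baselines differ accordingly, so the predicted survival should come out the same either way.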

Estimating the Required Sample Size for a Survival Regression

Note: Elsewhere in this chapter, we use the word power in its algebraic sense, such

as in x² is x to the power of 2. But in this section, we use power in its statistical

sense to mean the probability of getting a statistically significant result when

performing a statistical test.

Except for straight-line regression discussed in Chapter 16, sample-size calculations

for regression analysis tend not to be straightforward. If you find software

that will calculate sample-size estimates for survival regression, it often asks for

inputs you don’t have.

Very often, sample-size estimates for studies that use regression methods are

based on simpler analytical methods. We recommend that when you’re planning

a study that will be analyzed using PH regression, you base your sample-size estimate

on the simpler log-rank test, described in Chapter 22. The free PS program

handles these calculations very well.
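To give a feel for what a log-rank sample-size calculation involves, here is a rough Python sketch based on Schoenfeld's approximation for the number of events needed in a two-group comparison. It is not necessarily the method the PS program uses, and its results may differ from that program's. The function names, the default alpha and power, and the example inputs are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80, allocation=0.5):
    """Approximate number of events for a two-sided log-rank test.

    Uses Schoenfeld's approximation:
        events = (z_alpha + z_beta)^2 / (p * (1 - p) * ln(HR)^2)
    where p is the fraction of subjects allocated to one group.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    log_hr = math.log(hazard_ratio)
    events = (z_alpha + z_beta) ** 2 / (allocation * (1 - allocation) * log_hr ** 2)
    return math.ceil(events)


def required_subjects(events, event_probability):
    """Convert required events to subjects, given the assumed probability
    that a subject has the event during the study."""
    return math.ceil(events / event_probability)


# Illustrative inputs only: detect a hazard ratio of 0.5 with 80 percent
# power at alpha = 0.05, assuming half of all subjects have the event.
events_needed = required_events(0.5)
subjects_needed = required_subjects(events_needed, 0.5)
```

The sketch works in terms of events rather than subjects because the power of the log-rank test depends mainly on how many events occur. The assumed event probability is what converts that into an enrollment target.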